Abstract

In the highway problem, we are given an n-edge line graph (the highway), and a set of paths (the 
drivers), each one with its own budget. For a given assignment of edge weights (the tolls), the 
highway owner collects from each driver the weight of the associated path, when it does not 
exceed the budget of the driver, and zero otherwise. The goal is choosing weights so as to max- 
imize the profit. A lot of research has been devoted to this apparently simple problem. The 
highway problem was shown to be strongly NP-hard only recently [Elbassioni,Raman,Ray-'09].
The best-known approximation is $O(\log n/\log\log n)$ [Gamzu,Segev-'10], which improves
on the previous-best $O(\log n)$ approximation [Balcan,Blum-'06]. Finding a constant (or better!)
approximation algorithm is a well-known open problem in network design. Better approximations
are known only for a number of special cases.

In this paper we present a PTAS for the highway problem, hence closing the complexity 
status of the problem. Our result is based on a novel randomized dissection approach, which 
has some points in common with Arora's quadtree dissection for Euclidean network design 
[Arora-'98]. The basic idea is enclosing the highway in a bounding path, such that both the 
size of the bounding path and the position of the highway in it are random variables. Then we
consider a recursive $O(1)$-ary dissection of the bounding path, in subpaths of uniform optimal
weight. Since the optimal weights are unknown, we construct the dissection in a bottom-up 
fashion via dynamic programming, while computing the approximate solution at the same 
time. Our algorithm can be easily derandomized. 

We demonstrate the versatility of our technique by presenting PTASs for two variants of the
highway problem: the tollbooth problem with a constant number of leaves and the maximum-feasibility
subsystem problem on interval matrices. In both cases the previous best approximation
factors are polylogarithmic [Gamzu,Segev-'10,Elbassioni,Raman,Ray,Sitters-'09].

1 Introduction



Consider the following setting. We are given a single-road highway, which is partitioned into 
segments by tollbooths. The highway owner fixes a toll for each segment. A driver traveling
between two tollbooths pays the total toll of the corresponding segments. However, if the total
toll exceeds the budget of the driver, she will not use the highway (e.g., she will take a plane). Our 
goal is maximizing the profit of the highway owner. To that aim, we need to compromise between 
very low tolls (in which case all the drivers take the highway, but providing a small profit) and 
very high tolls (in which case no driver takes the highway, and the profit is zero). It is not hard 
to imagine other applications with a similar nature. For example, the highway segments might be 
replaced by the links of a (high-bandwidth) telecommunication network. 

The highway problem formalizes the scenarios above. We are given an $n$-edge line graph
$G = (V, E)$ (the highway), and a set $\mathcal{D} = \{D_1, \ldots, D_m\}$ of $m$ paths in $G$ (the drivers), each one
characterized by a value $b_j \in \mathbb{Q}_{\geq 0}$ (the budgets). For a given weight function $w : E \to \mathbb{Q}_{\geq 0}$ (the
tolls) and a driver $D$, let $w(D) := \sum_{e \in D} w(e)$ be the weight of $D$. Our goal is choosing $w$ so as to
maximize the following profit function:

$$\sum_{j :\, w(D_j) \leq b_j} w(D_j).$$
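
To make the objective concrete, the profit of a given toll assignment can be evaluated as in the following minimal sketch. The representation is an assumption made here for illustration: edges are indexed 0 to n-1 from left to right, and each driver is a hypothetical triple `(first_edge, last_edge, budget)` with inclusive edge indices.

```python
def profit(tolls, drivers):
    """Profit collected by the highway owner for a given toll vector.

    tolls   -- list of non-negative edge weights, one per highway edge
    drivers -- list of (first_edge, last_edge, budget), inclusive indices
    """
    total = 0
    for first, last, budget in drivers:
        # Weight of the driver's path: sum of the tolls on its edges.
        cost = sum(tolls[first:last + 1])
        # The driver pays only if the total toll does not exceed the budget.
        if cost <= budget:
            total += cost
    return total
```

For instance, with tolls `[1, 2, 3]` the driver `(1, 2, 4)` faces a total toll of 5, exceeds her budget, and contributes nothing.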

Despite the simplicity of its formulation and its clear relation to applications, there is a huge 
gap between known approximation and inapproximability results for the highway problem. The 
problem was shown to be strongly NP-hard very recently [10]. The best-known approximation
factor is $O(\log n/\log\log n)$ [14] (see also [2]). A quasi-polynomial-time approximation scheme
(QPTAS) is given in [12]. This is strong evidence of the existence of a PTAS for the problem.
However, even finding a constant approximation is considered a challenging open problem in
network design. For this reason, researchers focused on some relevant special cases [2, 5, 15, 18].

1.1 Our Results and Techniques

In this paper we present a deterministic polynomial-time approximation scheme (PTAS) for the 
highway problem, hence closing the complexity status of the problem. To achieve our goal, we 
exploit a novel randomized dissection approach. 

The basic idea is as follows. Let $\epsilon > 0$ be a small constant. Via simple reductions (see Section
1.3), we can restrict ourselves to the case that optimal weights $w^*(e)$ are in $\{0, 1\}$, and that the sum
$W^*$ of the optimal weights along the highway is polynomially bounded in the number $n$ of edges.
This introduces a $1 - O(\epsilon)$ factor in the approximation.

The dynamic program is based on the following strategy. We consider all the subpaths $P$ of
the highway, and guess the value $W \in \{0, 1, \ldots, W^*\}$ of the sum of the optimal weights along $P$.
Note that the number of pairs $(P, W)$ is polynomially bounded in $n$, due to the reductions above.

We next restrict our attention to the drivers $\mathcal{D}(P) := \{D \in \mathcal{D} : D \subseteq P\}$ which are entirely
contained in $P$, with the goal of approximating the corresponding optimal profit. The table entry
for $(P, W) = (G, W^*)$ will eventually give the desired approximate solution.

If $W \leq \tilde{W}$, for a fixed constant $\tilde{W}$, we simply guess the $W$ edges where the optimum solution
puts a weight of one. This provides the optimal profit for drivers in $\mathcal{D}(P)$. Assume now that $W > \tilde{W}$.
In this case, by considering all the possible partitions $\mathcal{P} = \{P_1, \ldots, P_\gamma\}$ of $P$ in $\gamma$ subpaths,
we can guess the partition where each $P_i$ takes a $1/\gamma$ fraction of the weight of $P$. Here $\gamma \geq 2$ is
a sufficiently large constant, depending on $\epsilon$. Observe that the set $\mathcal{P}_\gamma(P)$ of such partitions has
polynomial cardinality. Given $\mathcal{P}$, for the drivers included into some $P_i$ (i.e., in $\mathcal{D}(P_i)$), we account
for the (previously computed) profit of table entry $(P_i, W/\gamma)$.

^Throughout this paper we confuse graphs with their set of edges: the meaning will be clear from the context.

It remains to consider the profit of drivers $\tilde{\mathcal{D}}(P) := \mathcal{D}(P) - \bigcup_{i=1}^{\gamma} \mathcal{D}(P_i)$ which are contained in $P$,
but not in any $P_i$. This is also the crux of our method. Each driver $D \in \tilde{\mathcal{D}}(P)$ consists of a (possibly
empty) subset of consecutive subpaths $P_\ell, P_{\ell+1}, \ldots, P_r$, plus two (possibly empty) subpaths $P_{left}$
and $P_{right}$, with $P_{left} \subset P_{\ell-1}$ and $P_{right} \subset P_{r+1}$. Observe that, if the budget of $D$ is not exceeded,
then each middle subpath $P_i$, $i \in \{\ell, \ldots, r\}$, contributes with an additive term $W/\gamma$ to the profit of
$D$. In particular, this is independent from the way the weight $W/\gamma$ is distributed along $P_i$.

The situation is radically different for the boundary subpaths $P_{left}$ and $P_{right}$: for them the
profit can range from $0$ to $W/\gamma$, depending on the distribution of the weights along $P_{\ell-1}$ and
$P_{r+1}$, respectively. In order to implement efficiently the dynamic programming step, we simply
neglect the boundary subpaths. In other terms, we replace $D$ with the shortened driver $D' :=
D - (P_{left} \cup P_{right})$. At this point, we simply account $(r - \ell + 1)W/\gamma$ for the profit of $D$, if this
quantity does not exceed its budget, and zero otherwise. This way we obtain the overall profit for
drivers in $\tilde{\mathcal{D}}(P)$, and hence in $\mathcal{D}(P)$.

This approach has two opposite drawbacks: 

1. The profit computed might be too pessimistic. This is because we do not consider the profit
coming from $P_{left} \cup P_{right}$ (in particular, it might be $D = P_{left} \cup P_{right}$, and hence the shortened driver $D'$ is empty).

2. The profit computed might be too optimistic. In fact, it might happen that the weight along
the shortened driver $D'$ is below the budget of $D$, while the weight along $D$ exceeds it (due to the weight on
$P_{left} \cup P_{right}$). In that case we account for a positive profit, while the actual profit is zero.

We solve the second problem by restricting our attention to good drivers $D \in \tilde{\mathcal{D}}(P)$, i.e. drivers
which contain $\Omega(1/\epsilon)$ many subpaths $P_i$. It is then sufficient to scale down all the weights at the
end of the process by a factor $1 - O(\epsilon)$ to ensure that the budget of good drivers is not exceeded.

Observe that this does not solve the first problem: indeed, it makes it even worse (since we
consider fewer drivers, besides shortening them). At this point, randomization comes into play. We
initially enclose the highway in a bounding path. Both the length (i.e., the number of edges) of the
bounding path and the position of the highway in it are random variables. To this instance we
apply the approach above. Consider a driver $D$ which contributes to the optimal profit. For a
proper choice of the random variables, with probability $1 - O(\epsilon)$, $D$ is considered in the dynamic
program for a path $P$ of weight $W$ such that the profit of $D$ is much larger than $W/\gamma$. Hence $D$ is
good with probability close to one. This introduces a factor $1 - O(\epsilon)$ in the approximation ratio.

As we will see, the domain of the random variables has polynomial size. Hence, the algorithm 
above can be easily derandomized by considering all the possible realizations. 

We believe that our technique will find other applications, and hence it might be of independent
interest. In order to motivate that, we show how to apply it to two related problems (see
Section 4):

• The tollbooth problem is the generalization of the highway problem where the input graph $G$
is a tree (rather than a line). This problem is APX-hard, and the best-known approximation
for it is $O(\log n/\log\log n)$ [14]. Here we present a PTAS for the practically-relevant special case
that $G$ has a constant number of leaves.

• In the maximum-feasibility subsystem problem we are given a set of vectors $a_1, \ldots, a_m \in \mathbb{Q}^n$
and a set of $m$ pairs $(\ell_j, u_j)$, with $0 \leq \ell_j \leq u_j$ and $j = 1, \ldots, m$. The goal is computing a
vector $w \in \mathbb{Q}^n_{\geq 0}$ such that the constraint $\ell_j \leq a_j^T w \leq u_j$ is satisfied by the largest possible
number of indexes $j$. Intuitively, the vectors $a_j^T$ can be interpreted as the rows of a matrix $A$:
the product $Aw \in \mathbb{Q}^m$ is what we wish to upper and lower bound. In this paper we restrict
to the case that the vectors $a_j$ have entries in $\{0, 1\}$, and the 1's appear consecutively (i.e., $A$
is an interval matrix).

Elbassioni, Raman, Ray, and Sitters [11] show that this problem is APX-hard. Moreover, if
we allow a violation of the lower and upper bounds by a factor $(1 + \epsilon)$, then there is a
polylogarithmic approximation algorithm running in polynomial time, and an exact algorithm
running in quasi-polynomial time. Here we show how to obtain a $(1 + \epsilon)$ approximation in
polynomial time in the same framework.
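
As an illustration of the interval-matrix setting, the following sketch counts which constraints a candidate vector satisfies. The encoding is an assumption made here: each row with consecutive ones is a hypothetical tuple `(first_col, last_col, lower, upper)` with inclusive column indices, so that $a_j^T w$ is just a contiguous sum.

```python
def satisfied_rows(w, rows):
    """Indices j of the constraints lower_j <= a_j^T w <= upper_j that hold.

    w    -- list of non-negative entries of the candidate vector
    rows -- list of (first_col, last_col, lower, upper); the row a_j has
            ones exactly in columns first_col..last_col (inclusive)
    """
    result = []
    for j, (first, last, lower, upper) in enumerate(rows):
        # For an interval row, a_j^T w is the sum of a contiguous slice of w.
        value = sum(w[first:last + 1])
        if lower <= value <= upper:
            result.append(j)
    return result
```

The objective of the problem is then to choose `w` so that `len(satisfied_rows(w, rows))` is as large as possible.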

1.2 Related Work 

The highway problem was not even known to be NP-hard until recently. For example, this is
posed as an open problem by Guruswami et al. [15]. Weak NP-hardness was shown via a
reduction from partition by Briest and Krysta [5]. Very recently, Elbassioni, Raman, and Ray
[10] proved strong NP-hardness via a reduction from max-2-SAT. Balcan and Blum [2] give an
$O(\log n)$ approximation for the problem. Their algorithm partitions the paths in groups of different
length. Then it applies a constant factor approximation algorithm in [15] for the rooted version
of the problem, where all drivers contain a given node, to each group separately. The approximation
was very recently improved to $O(\log n/\log\log n)$ by Gamzu and Segev [14]. Their algorithm,
which also works for the more general tollbooth problem, combines the notion of tree separators
with a generalization of the algorithm for the rooted case mentioned before. The QPTAS by
Elbassioni, Sitters, and Zhang [12] exploits the profiling technique introduced by Bansal et al. [3]. The
basic idea is guessing the approximate shape of the cumulative weights to the left and right of a
given edge. This allows one to partition the problem into two sub-problems, which can be solved
recursively.

There are better approximation results, all based on dynamic programming, for a number of
special cases. In [2] a constant approximation is given for the case that all the paths have roughly
the same length. An FPTAS is described by Hartline and Koltun [18] for the case that the highway
has constant length (i.e., $n = O(1)$). This was generalized to the case of constant-length paths in
[15]. In [15] the authors also present an FPTAS for the case that budgets are upper bounded by a
constant. An FPTAS is also known [2, 5] for the case that paths induce a laminar family.

The tollbooth problem is the generalization of the highway problem where $G$ is a tree. An
$O(\log n)$ approximation was developed in [10]. As already mentioned, this was very recently
improved to $O(\log n/\log\log n)$ [14]. The tollbooth problem is APX-hard [15], and for general
graphs it is APX-hard even when the graph has bounded degree, the paths have constant length
and each edge belongs to a constant number of paths [5].

^The latter result is not a contradiction, since we compare to the optimum solution, which may not even slightly
violate the inequalities.

^In a laminar family of paths, two paths which intersect are contained one in the other.

The highway and tollbooth problems belong to the family of pricing problems with single-minded
customers and unlimited supply. Here we are given a set of customers: each customer
wants to buy a subset of items (bundle), if its total price does not exceed her budget. In the highway
terminology, each driver is a subset of edges (rather than a path). For this problem an $O(\log n +
\log m)$ approximation is given in [15]. This bound was refined in [5] to $O(\log L + \log B)$, where
$L$ denotes the maximum number of items in a bundle and $B$ the maximum number of bundles
containing a given item. An $O(L)$ approximation is given in [2]. On the negative side, Demaine et
al. [9] show that this problem is hard to approximate within $\log^d n$, for some $d > 0$, assuming that
$NP \not\subseteq BPTIME(2^{n^\epsilon})$ for some $\epsilon > 0$.

The highway problem has some aspects in common with the well-studied unsplittable flow 
problem on line graphs. In this problem we are given a line graph $G = (V, E)$, with edge capacities
and a set of paths $D_j$, each one characterized by a demand and a profit. The goal is selecting
a maximum profit subset of paths such that the sum of the demands of selected paths on each
edge does not exceed the corresponding capacity. For the special case of unit capacities and demands,
a $(2 + \epsilon)$ approximation is given by Calinescu et al. [6], improving on [4, 19]. Under the
no-bottleneck assumption, the same approximation guarantee is achieved for the general case by
Chekuri, Mydlarz, and Shepherd [8], improving on an earlier constant approximation under the
same assumption [7]. Eventually, a QPTAS is given in [3]. The QPTAS for the highway problem
in [12] exploits the same basic technique as in [3]. Our hope is that, in turn, our PTAS for the
highway problem will inspire a PTAS for the line-graph unsplittable flow problem. However, this 
seems to require some new ideas and we leave it as a challenging open problem. 

For general 0/1-matrices, the maximum-feasible subsystem problem (with no violation) is not
approximable within $\Omega(n^{1/3-\epsilon})$ for any $\epsilon > 0$ even for $\ell_j = u_j$, unless ZPP = NP [11]. If each row
of $A$ contains 3 non-zero arbitrary coefficients, then even $n^{1-\epsilon}$ approximations are not possible
in polynomial time [16] (see also the previous hardness result [13]). The best-known $O(n/\log n)$
approximation for this problem is due to Halldórsson [17].

The technique behind our PTAS resembles Arora's quadtree dissection for Euclidean network
design [1]. The basic idea there is enclosing the set of input points into a bounding box, then
recursively partitioning it in a constant number of boxes. This dissection is then randomly shifted.
On the resulting random dissection, one applies dynamic programming. We similarly enclose 
the highway in a bounding path, and partition the latter. Like in Arora's approach, the dissec- 
tion is randomly shifted. Differently from that case and crucially for our analysis, the size of the 
bounding path is a random variable as well. Another major difference is that the dissection is 
not uniform with respect to input properties, but with respect to the optimal weights: for this 
reason the dissection is constructed in a bottom-up, rather than top-down, fashion via dynamic 
programming (while computing the approximate solution in parallel). 

1.3 Preliminaries 

Let $OPT = (w^*, \mathcal{D}^*)$ be the optimum solution, where $w^*$ is the optimal weight function and $\mathcal{D}^*$
is the set of drivers $D_j$ such that $w^*(D_j) \leq b_j$. By $opt$ we denote the optimal profit. Our PTAS
starts with a sequence of rounding steps to transform the input (and the optimum solution) in
a convenient form, while losing only a factor $1 - 2\epsilon$ in the approximation. Since these steps are
rather standard, we discuss them here, while in Section 2 we will focus on the novel techniques
introduced in this paper.

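W.l.o.g. we assume $1/(2\epsilon) \in \mathbb{N}$ and $\epsilon \leq 1/2$. Let $b_{\max}$ be the largest budget. After scaling all
budgets, one has $b_{\max} = m/\epsilon^2$. Observe that trivially $opt \geq b_{\max}$. First, we discard all drivers with
a budget smaller than $1/\epsilon$. Next, we round down the budgets to the nearest integer. Any solution
to the rounded instance gives a feasible solution of the same value for the original instance.
Moreover, the optimal solution to the rounded instance is a good approximation of the original
optimum. In fact, $(w, \mathcal{D})$ with $\mathcal{D} := \{D_j \in \mathcal{D}^* \mid b_j \geq 1/\epsilon\}$ and $w(e) := \frac{w^*(e)}{1+\epsilon}$ is a feasible solution to
the new instance since

$$w(D_j) = \sum_{e \in D_j} \frac{w^*(e)}{1+\epsilon} \leq \frac{b_j}{1+\epsilon} \leq \lfloor b_j \rfloor$$

for any $D_j \in \mathcal{D}^*$ with $b_j \geq 1/\epsilon$. The profit of this solution is

$$\sum_{D_j \in \mathcal{D}} w(D_j) = \frac{1}{1+\epsilon} \sum_{D_j \in \mathcal{D}^* :\, b_j \geq 1/\epsilon} w^*(D_j) \geq \frac{1}{1+\epsilon}\Big(opt - m \cdot \frac{1}{\epsilon}\Big) \geq (1 - 2\epsilon)\, opt.$$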
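
The budget rounding described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `round_budgets` is hypothetical, and $\epsilon$ is assumed to be passed as a `fractions.Fraction` so that all comparisons are exact.

```python
from fractions import Fraction
from math import floor


def round_budgets(budgets, eps):
    """Scale, filter and round budgets as in the rounding steps above.

    budgets -- positive budgets, one per driver
    eps     -- accuracy parameter as a Fraction (assumption: 1/(2*eps) is an integer)
    Returns a dict {driver index: rounded budget} for the surviving drivers.
    """
    m = len(budgets)
    # Scale all budgets so that the largest one becomes m / eps^2.
    scale = Fraction(m) / (eps * eps) / max(budgets)
    rounded = {}
    for j, b in enumerate(budgets):
        scaled = Fraction(b) * scale
        # Discard drivers whose scaled budget is smaller than 1/eps.
        if scaled >= 1 / eps:
            # Round the surviving budgets down to the nearest integer.
            rounded[j] = floor(scaled)
    return rounded
```

With `eps = Fraction(1, 2)` and budgets `[1, 6, 12]` the scaling factor is 1, the first driver is discarded because her budget is below $1/\epsilon = 2$, and the other two budgets are already integral.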

As observed in [7], the optimal weights for this instance can be assumed to be integral. In fact,
given the optimal drivers $\mathcal{D}^*$, the corresponding optimal weights $w^*$ can be computed by solving
an ILP whose 0-1 constraint matrix is totally unimodular. Since the largest weight in $w^*$ is trivially
not larger than the largest budget (i.e. $m/\epsilon^2$ after rounding), we can conclude that $w^* : E \to
\{0, 1, \ldots, m/\epsilon^2\}$. By replacing each edge with a path of length $m/\epsilon^2$, we can further assume $w^* :
E \to \{0, 1\}$. Let $W^* = \sum_{e \in E} w^*(e)$ be the total weight of the solution, and $\gamma = (1/\epsilon)^{1/\epsilon}$. By adding
$W^*\gamma$ dummy edges (not crossed by any driver), say, to the right of the highway, we can assume
that $W^* = \gamma^\ell$ for some integer $\ell$ (in fact, the weight assigned to dummy edges is irrelevant).
Observe that $W^* \leq n\, m\, \gamma/\epsilon^2$: hence we can guess the value of $W^*$ in polynomial-time.

We call an instance of the highway problem with the properties above well-rounded. The
discussion above implies the following lemma.

Lemma 1. For any $\epsilon > 0$, there is a polynomial reduction from the highway problem to the same
problem on well-rounded instances which is approximation-preserving modulo a factor $(1 + \epsilon)$.



2 A PTAS for the Highway Problem 

From the discussion in Section 1.3 we assume that the input instance is well-rounded. Let $\epsilon > 0$
be a constant parameter, $\delta = 1/(2\epsilon) \in \mathbb{N}$ and $\gamma = (1/\epsilon)^{1/\epsilon}$. Our PTAS hptas for the highway
problem is described in Figure 1.

In the Bounding Phase (B), we first guess the total optimal weight $W^*$ (Step B1). By guessing, we
mean that we run the rest of the algorithm for every feasible choice of $W^*$ (which is a polynomially
bounded integer). Then, we enclose the highway in a bounding path (Step B2). Both the length of
the bounding path and the position of the highway are proper functions of two random variables
$x$ and $y$. All the probabilities and expectations in this paper are with respect to the choice of those
two variables.
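
The bounding step can be sketched as below, following the lengths stated in Step B2 of Figure 1: a path of $x$ edges on the left and of $W^* \cdot ((1/\epsilon)^y - 1) - x$ edges on the right. The function names and the integer parameter `eps_inv` (standing for $1/\epsilon$) are assumptions introduced for this illustration.

```python
def bounding_path(w_star, eps_inv, x, y):
    """Sizes of the bounding path for one realization of (x, y).

    w_star  -- guessed total optimal weight W*
    eps_inv -- the integer 1/eps
    Returns (left_edges, right_edges, w_prime), where w_prime is the total
    weight W' = W* * (1/eps)^y targeted on the bounding path.
    """
    if not (1 <= x <= w_star and 1 <= y <= eps_inv):
        raise ValueError("x or y outside its domain")
    factor = eps_inv ** y
    left = x                             # edges attached to the left of G
    right = w_star * (factor - 1) - x    # edges attached to the right of G
    return left, right, w_star * factor


def all_realizations(w_star, eps_inv):
    """Every (x, y) pair; trying all of them removes the randomness."""
    return [(x, y) for x in range(1, w_star + 1) for y in range(1, eps_inv + 1)]
```

The number of realizations is $W^* / \epsilon$, which is polynomial for well-rounded instances, so enumerating `all_realizations` is one way to picture the derandomization.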

In the Dynamic Programming Phase (D), we compute the almost optimal profit $\phi(P, W)$ which
can be obtained from the drivers in $P$ by placing $W$-many 1's along $P$. In the initialization step
(Step D1), we compute profits $\phi(P, (1/\epsilon)^y)$ by brute force, considering all the $\binom{|P|}{(1/\epsilon)^y}$-many possible
ways to place $(1/\epsilon)^y = O(1)$-many 1's on the edges of $P$. In the dynamic programming step
(Step D2), we consider the best partition $\mathcal{P} = \{P_1, \ldots, P_\gamma\}$ of $P$ into $\gamma$ subpaths. The set of candidate
partitions is denoted by $\mathcal{P}_\gamma(P)$. We first add to $\phi(P, W)$ the profits $\phi(P_i, W/\gamma)$ for each $i$. Then
we consider the good drivers $D_j$, i.e. the drivers in $P$ which contain $n_j \geq \delta$ subpaths $P_i$. For each
such driver, we increase $\phi(P, W)$ by the profit associated to the shortened driver $D'_j = \bigcup_{P_i \subseteq D_j} P_i$,
i.e. $W/\gamma \cdot n_j$, unless this quantity exceeds the budget $b_j$.
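
The contribution of good drivers for one fixed candidate partition can be sketched as follows. The encoding is an assumption made for illustration: the partition is given by hypothetical cut positions `cuts`, piece $i$ covers the edges `cuts[i]` to `cuts[i+1] - 1`, and drivers are triples `(first_edge, last_edge, budget)` with inclusive indices.

```python
def shortened_profit(cuts, drivers, piece_weight, delta):
    """Profit accounted for shortened good drivers under one partition.

    cuts         -- sorted edge indices c_0 < c_1 < ... < c_k delimiting the pieces
    drivers      -- list of (first_edge, last_edge, budget), inclusive indices
    piece_weight -- weight W/gamma assumed on every piece
    delta        -- minimum number of fully contained pieces for a good driver
    """
    total = 0
    for first, last, budget in drivers:
        # n_j: number of pieces lying entirely inside the driver's path.
        n_j = sum(1 for lo, hi in zip(cuts, cuts[1:])
                  if first <= lo and hi - 1 <= last)
        value = piece_weight * n_j
        # Only good drivers whose shortened weight fits the budget count.
        if n_j >= delta and value <= budget:
            total += value
    return total
```

The full step would add the table entries of the pieces to this quantity and maximize over all candidate partitions; the sketch only covers the second sum.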






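Figure 1 PTAS for the highway problem. Here $\delta := 1/(2\epsilon) \in \mathbb{N}$ and $\gamma := (1/\epsilon)^{1/\epsilon}$.
Input: Well-rounded highway instance $G = (V, E)$ and $(D_j, b_j)$, $j = 1, 2, \ldots, m$.
Output: Edge weights $w : E \to \mathbb{Q}_{\geq 0}$.
Algorithm:

(B) Bounding Phase:

(B1) Guess the value of the total weight $W^* = \gamma^\ell$, $\ell \in \mathbb{N}$.

(B2) Choose integers $x \in \{1, 2, \ldots, W^*\}$ and $y \in \{1, 2, \ldots, 1/\epsilon\}$ uniformly at random. Attach
a path of length $W^* \cdot ((1/\epsilon)^y - 1) - x$ (resp., $x$) to the right (resp., left) of $G$. Let $G_0$ be
the resulting line graph, and $W' = W^* \cdot (1/\epsilon)^y$.

(D) Dynamic Programming Phase:

(D1) For every path $P \subseteq G_0$,

$$\phi(P, (1/\epsilon)^y) = \max_{w : P \to \{0,1\},\; w(P) = (1/\epsilon)^y} \;\sum_{D_j \subseteq P :\, w(D_j) \leq b_j} w(D_j).$$

(D2) For every path $P \subseteq G_0$, and for $W = W'/\gamma^q$, $q = \ell - 1, \ell - 2, \ldots, 0$,

$$\phi(P, W) = \max_{\{P_1, \ldots, P_\gamma\} \in \mathcal{P}_\gamma(P)} \Big\{ \sum_{i=1}^{\gamma} \phi(P_i, W/\gamma) + \sum_{\substack{D_j \subseteq P :\\ n_j := |\{i : P_i \subseteq D_j\}| \geq \delta,\\ W/\gamma \cdot n_j \leq b_j}} W/\gamma \cdot n_j \Big\}.$$

(S) Scaling Phase:

(S1) Derive $w' : G_0 \to \{0, 1\}$ determining the value of $\phi(G_0, W')$.

(S2) Output $w : E \to \mathbb{Q}_{\geq 0}$, where $w(e) = w'(e) \cdot \frac{1}{1+4\epsilon}$.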



In the final Scaling Phase (S), we derive from the dynamic programming table the weights $w'$
determining the value of $\phi(G_0, W')$ (Step S1). Then we restrict our attention to the edges of the
(original) highway, and scale the corresponding weights down by a factor $1 + 4\epsilon$ (Step S2).
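
For intuition on the initialization step (Step D1), the brute-force computation of a base table entry can be sketched as follows. The interface is hypothetical: the subpath is assumed to have `num_edges` edges indexed from 0, exactly `ones` of them receive weight 1, and drivers contained in the subpath are triples `(first_edge, last_edge, budget)` with inclusive indices.

```python
from itertools import combinations


def brute_force_entry(num_edges, ones, drivers):
    """Best profit over all 0/1 weightings of a subpath with a fixed number of ones.

    Returns (best_profit, best_weights); returns (-1, None) when no weighting
    exists, i.e. when ones > num_edges.
    """
    best_profit, best_weights = -1, None
    # Enumerate every way to choose the edges that receive weight 1.
    for positions in combinations(range(num_edges), ones):
        weights = [0] * num_edges
        for p in positions:
            weights[p] = 1
        value = 0
        for first, last, budget in drivers:
            cost = sum(weights[first:last + 1])
            if cost <= budget:   # the driver pays only within her budget
                value += cost
        if value > best_profit:
            best_profit, best_weights = value, weights
    return best_profit, best_weights
```

Since the number of ones is a constant in this step, the number of enumerated weightings is polynomial in the length of the subpath.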

3 Analysis 

To avoid any confusion, let $n$ and $n'$ denote the number of edges in the original and well-rounded
instance, respectively. Recall that, for any constant $\epsilon$, $n'$ is polynomially bounded in $n$ and $m$.

Lemma 2. Algorithm hptas runs in polynomial time.

Proof. Since $W^*$ is an integer bounded by $nm\gamma/\epsilon^2$, its value can be guessed by trying a polynomial
number of values. For all the $O(n'^2)$ choices of $P$ in Step D1, the number of candidate functions $w$
to be considered is $O(n'^{(1/\epsilon)^y})$. In Step D2, for all the $O(n'^2)$ choices of $P$, there are $O(n'^{\gamma})$ possible
choices for the $P_i$'s. The claim follows. □

In the rest of the analysis we consider only the run of the algorithm where $W^*$ is guessed
correctly. The next lemma shows that the profit $apx$ of the finally returned solution essentially
coincides with the value $apx_D := \phi(G_0, W')$ that we obtain by dynamic programming. Here we
crucially exploit the fact that we only consider (good) drivers $D_j$ with large $n_j$.



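Lemma 3. $apx \geq \frac{1}{1+4\epsilon}\, apx_D$.

Proof. Let $w'$ and $\mathcal{D}'$ be the weights and the set of drivers determining $apx_D$. Consider the
corresponding dissection, and let $n_j = |\{i : P_i \subseteq D_j\}|$ and $D'_j = \bigcup_{P_i \subseteq D_j} P_i$ be defined with respect to
that dissection for each $D_j$.

For any $D_j \in \mathcal{D}'$, $n_j \geq \delta = 1/(2\epsilon)$ and $w'(D'_j) = W/\gamma \cdot n_j \leq b_j$. The difference in weight between
$D_j$ and $D'_j$ only lies in the two sub-intervals owning the endings of $D_j$, and hence $w'(D'_j) \leq
w'(D_j) \leq \frac{W}{\gamma}(n_j + 2)$. It follows that $w(D_j) = \frac{1}{1+4\epsilon} w'(D_j) \leq \frac{n_j}{n_j+2} w'(D_j) \leq n_j \frac{W}{\gamma} \leq b_j$, where the
first inequality uses $n_j \geq 1/(2\epsilon)$. Hence, $D_j$ contributes to $apx$ with a profit
$w(D_j) = \frac{1}{1+4\epsilon} w'(D_j) \geq \frac{1}{1+4\epsilon} w'(D'_j)$. The claim follows since
$apx \geq \sum_{D_j \in \mathcal{D}'} w(D_j) \geq \frac{1}{1+4\epsilon} \sum_{D_j \in \mathcal{D}'} w'(D'_j) = \frac{1}{1+4\epsilon}\, apx_D$. □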
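
The role of the threshold $\delta = 1/(2\epsilon)$ in this argument can be checked numerically. The sketch below assumes that the scaling factor of Step S2 is $1/(1+4\epsilon)$ and that a driver containing $n_j$ full pieces of weight `piece_weight` has unscaled weight at most `(n_j + 2) * piece_weight`; the function names are hypothetical.

```python
from fractions import Fraction


def scaled_weight_bound(n_j, piece_weight, eps):
    """Upper bound on the scaled weight of a driver containing n_j full pieces."""
    return Fraction(n_j + 2) * piece_weight / (1 + 4 * eps)


def budget_respected(n_j, piece_weight, eps):
    """True if the scaled bound stays below the shortened weight n_j * piece_weight."""
    return scaled_weight_bound(n_j, piece_weight, eps) <= n_j * piece_weight
```

With `eps = Fraction(1, 4)` one has $\delta = 2$: the check succeeds for every `n_j >= 2` and fails for `n_j = 1`, which is why only drivers with at least $\delta$ fully contained pieces are accounted for.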

It remains to lower bound $apx_D$ in terms of $opt$. In order to simplify the analysis, suppose that
we are given an oracle which, for a given $P \subseteq G_0$ with $w^*(P) = W = W'/\gamma^q$, $q < \ell$, produces
a partition $\mathcal{P}^* = \{P_1^*, \ldots, P_\gamma^*\}$ such that $w^*(P_i^*) = W/\gamma$. Also assume that we remove all the
drivers but the ones $\mathcal{D}^*$ in the optimal solution. Consider the variant of Step D where we apply
recursively the following Bellman equation

$$\phi'(P, W) = \sum_{i=1}^{\gamma} \phi'(P_i^*, W/\gamma) + \sum_{\substack{\mathcal{D}^* \ni D_j \subseteq P :\\ n_j := |\{i : P_i^* \subseteq D_j\}| \geq \delta,\\ W/\gamma \cdot n_j \leq b_j}} W/\gamma \cdot n_j,$$

until $W = (1/\epsilon)^y$, in which case we use brute force to compute the optimal weights like in Step
D1. It is not hard to see that $apx_O := \phi'(G_0, W')$ is a lower bound on $apx_D$.

Corollary 4. $apx_D \geq apx_O$.

Hence it is sufficient to lower bound $apx_O$. The value $apx_O$ is associated to a unique optimal
dissection. With the same notation as in the proof of Lemma 3 we let, for a given driver $D_j$, $n_j$ and
$D'_j$ be defined with respect to the optimal dissection. We next say that a subpath in the optimal
dissection is at level $q \in \{0, 1, \ldots, \ell\}$ if its optimal weight is $W'/\gamma^q$. Similarly, we say that a driver
$D_j$ is at level $q$ in the optimal dissection if it is contained in a subpath of level $q$, but not $q + 1$.

Let $a_q = W'/\gamma^q$. Consider any driver $D_j \in \mathcal{D}^*$, with $a_{q+1} \leq w^*(D_j) < a_q$. We call $D_j$ good if it
is at level $\ell$ in the dissection, or it is at level $q < \ell$ and it contains at least $\delta$ subpaths of level $q + 1$
(i.e., $n_j \geq \delta$).

Observe that good drivers $D_j$ contribute to the value of $apx_O$ with a profit $w^*(D'_j) \geq w^*(D_j) \cdot \frac{\delta}{\delta+2}
= \frac{1}{1+4\epsilon} \cdot w^*(D_j)$. Hence, it is sufficient to show that a given driver in $\mathcal{D}^*$ is good with
probability close to one.

Lemma 5. Each driver $D_j \in \mathcal{D}^*$ is good with probability at least $1 - 3\epsilon$.

Proof. Let us upper bound the probability that a driver $D_j$ is bad (i.e., not good). We say that driver
$D_j$ is risky if

$$\exists q : \epsilon a_q < w^*(D_j) < \frac{1}{\epsilon} a_q.$$

Figure 2 Log-scale axis. The regions of risky weights are gray shaded.

Consider a log-scale axis and term tick the distance that corresponds to a factor of $1/\epsilon$. Then
consecutive $a_q$'s have a distance of $1/\epsilon$ ticks to each other (see Figure 2). The region of risky
weights w.r.t. a specific $a_q$ is the interval $]\epsilon a_q, a_q/\epsilon[$, hence on the log-scaled axis it is an (open)
interval of 2 ticks length. The random choice of $y$ yields that all $a_q$'s are simultaneously shifted by
$y \in \{1, \ldots, 1/\epsilon\}$ ticks to the right. Hence for each value of $w^*(D_j)$ at most 2 out of $1/\epsilon$ choices of $y$
cause that $D_j$ is risky:

$$\Pr[D_j \text{ is risky}] \leq 2\epsilon.$$
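
The counting argument can be checked by direct enumeration. The sketch below assumes, as in the rest of the paper, that $W^* = \gamma^\ell$ with $\gamma = (1/\epsilon)^{1/\epsilon}$, that $W' = W^* \cdot (1/\epsilon)^y$ and that $a_q = W'/\gamma^q$; the function name and the integer parameter `eps_inv` (standing for $1/\epsilon$) are introduced here for illustration only.

```python
def risky_shifts(weight, eps_inv, ell):
    """Values of y in {1, ..., 1/eps} for which a driver of the given weight is risky.

    weight  -- optimal weight of the driver (positive integer)
    eps_inv -- the integer 1/eps
    ell     -- number of levels, so that W* = gamma^ell
    """
    gamma = eps_inv ** eps_inv
    w_star = gamma ** ell
    risky = []
    for y in range(1, eps_inv + 1):
        w_prime = w_star * eps_inv ** y
        for q in range(ell + 1):
            a_q = w_prime // gamma ** q   # exact: gamma^q divides W'
            # Risky means eps * a_q < weight < a_q / eps (strict on both sides).
            if a_q < weight * eps_inv and weight < a_q * eps_inv:
                risky.append(y)
                break
    return risky
```

For example, with $\epsilon = 1/4$ and one level, a driver of weight 5 is risky exactly for $y \in \{1, 2\}$, i.e. for 2 out of the 4 possible shifts.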

Next condition on the event that $D_j$ is not risky. Suppose $D_j$ is not at level $\ell$, otherwise there is
nothing to show. Observe that there is a $q$ with

$$\frac{1}{\epsilon} a_q \leq w^*(D_j) \leq \epsilon\, a_{q-1}.$$

Then deterministically $D_j$ contains at least $1/\epsilon - 1 \geq 1/(2\epsilon) = \delta$ many level $q$ subpaths. Since the
random shift $x$ is chosen uniformly at random from $\{1, \ldots, W^*\}$ and $W^*$ is a multiple of $a_{q-1}$, we
furthermore have

$$\Pr[D_j \text{ is at level} < q - 1] \leq \frac{w^*(D_j)}{a_{q-1}} \leq \epsilon.$$

Applying the union bound, we obtain that driver $D_j$ is bad with probability at most $3\epsilon$. □
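Corollary 6. $E[apx_O] \geq \frac{1-3\epsilon}{1+4\epsilon}\, opt$.

Proof. By linearity of expectation

$$E[apx_O] \geq E\Big[\sum_{\substack{D_j \in \mathcal{D}^*,\\ D_j \text{ good}}} w^*(D'_j)\Big] \geq \frac{1}{1+4\epsilon}\, E\Big[\sum_{\substack{D_j \in \mathcal{D}^*,\\ D_j \text{ good}}} w^*(D_j)\Big] \geq \frac{1-3\epsilon}{1+4\epsilon} \sum_{D_j \in \mathcal{D}^*} w^*(D_j) = \frac{1-3\epsilon}{1+4\epsilon}\, opt. \quad □$$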

Now we have all the ingredients to prove the main result of this paper. 
Theorem 7. There is a randomized PTAS for the highway problem. 

Proof. Consider the randomized algorithm which first transforms the input into a well-rounded
instance as described in Section 1.3 and then applies algorithm hptas. From Lemmas 1 and 2
this algorithm takes polynomial time. By Lemma 1, Lemma 3, Corollary 4 and Corollary 6 the
approximation ratio of the algorithm is $(1+\epsilon)\frac{(1+4\epsilon)^2}{1-3\epsilon} = 1 + O(\epsilon)$. □

The PTAS in Theorem 7 can be derandomized by considering all the (polynomially many) choices
of $x$ and $y$ in Step B2.

Corollary 8. There is a deterministic PTAS for the highway problem. 



^Except for the case when $a_{q-1} = W'$, but then deterministically the driver $D_j$ cannot cross the boundary.

4 Extensions



In this section we extend our approach to two variants of the highway problem. 

4.1 Tollbooth with a Constant Number of Leaves

We next sketch a PTAS for the tollbooth problem, when the input graph $G$ is a tree with a constant
number $O(1)$ of leaves: details are given in Appendix A. Recall that the problem is APX-hard
when the number of leaves is arbitrary.

By the same arguments as in the highway case, we assume that optimal weights $w^*$ are 0/1-valued,
and that their sum $W^*$ is bounded by a polynomial in $n$. We choose an arbitrary leaf $s(G)$
of $G$ as a source, and call the other leaves sinks. Analogously, given any subtree $T$ of $G$, we call
the leaf $s(T)$ of $T$ which is closest to $s(G)$, the source of $T$. The other leaves of $T$ are called sinks of
$T$. By appending a path of length $W^*\gamma$ to $s(G)$, we can assume that the total weight along each
source-sink pair is $W := \gamma^\ell$ for some integer $\ell$. The resulting instance is well-rounded.

Imagine splitting $G$ at any node whose $w^*$-distance from $s(G)$ is an integer multiple of $W/\gamma$.
In such a way we obtain a forest $\mathcal{T} = \{T_1, \ldots, T_q\}$ of subtrees with the following property: any
source-sink path in $T_i$ has weight $W/\gamma$. We iterate this process until the total weight which has to
be installed on the subtree reaches a constant value. We call this dissection optimal.

Consider a driver $D_j$ and let $T$ be the smallest subtree in the optimal dissection that fully
contains $D_j$. Suppose $W' = W/\gamma^q$ is the weight that $w^*$ installs on any source-sink path of $T$.
Let $\mathcal{T} = \{T_1, \ldots, T_q\}$ be the partition of $T$ in the optimal dissection. We say that $D_j$ crosses $T_i$ if
it contains exactly one source-sink path of $T_i$. We say that driver $D_j$ is good if the number $n_j$ of
crossed subtrees is at least a large constant $\delta$. Also in this case, we can define a shortened
driver $D'_j = \bigcup_{T_i \text{ crossed by } D_j} (D_j \cap T_i)$. However note that in this case $D'_j$ might consist of two
disjoint paths. (In particular, this might happen if $D_j$ does not lie along a source-sink path of $G$).

Analogously to the highway case, it is sufficient to show that the profit coming from shortened
drivers is large with respect to the optimal dissection. Then for subtrees $T$ of the instance and
weights $W'$, we compute table entries $\phi(T, W')$ giving the optimum profit that can be obtained
from the shortened paths of good drivers $D_j \subseteq T$, in such a way that on each path from $s(T)$ to
any other leaf of $T$ one installs a total weight of $W'$.

Theorem 9. There is a deterministic PTAS for the tollbooth problem with a constant number of 
leaves. 

4.2 Maximum-Feasible Subsystem for Interval Matrices 

In this section we sketch a multi-criteria PTAS for the maximum-feasible subsystem problem on 
interval matrices (MaxFS). More precisely, we show the slightly more general statement: 

Theorem 10. Given a matrix $A \in \{0, 1\}^{m \times n}$ with rows $a_1, \ldots, a_m$ having consecutive ones, weights
$v_1, \ldots, v_m \in \mathbb{Q}_{\geq 0}$ and integer bounds $0 \leq \ell_j \leq u_j$, $j = 1, \ldots, m$. Let $opt = \max_{w \geq 0} \{\sum_{j :\, \ell_j \leq a_j^T w \leq u_j} v_j\}$.
Then for every fixed $\epsilon > 0$ one can compute deterministically in polynomial time in $n$, $m$ and
$\log \max_j\{\ell_j\}$ a weight function $w \geq 0$ and a set $J \subseteq \{1, \ldots, m\}$ such that $\sum_{j \in J} v_j \geq (1 - \epsilon)\, opt$ and
$\ell_j \leq a_j^T w \leq (1 + \epsilon) u_j$ for all $j \in J$.

By standard arguments, one can round the profits $v_j$ such that they become integers between
$0$ and $m/\epsilon$. Then each constraint $j$ can be replaced by $v_j$ many constraints with unit profit. Choosing
$\epsilon$ accordingly smaller and scaling the weight function by $1 + O(\epsilon)$, it suffices to find a solution $w$
that satisfies $opt/(1 + O(\epsilon))$ many constraints approximately, i.e. $\ell_j/(1 + O(\epsilon)) \leq a_j^T w \leq u_j(1 +
O(\epsilon))$.

It is maybe easier to think of MaxFS as a variant of the highway problem where: (1) the
consecutive 1's in each row $j$ define a driver $D_j$ in a line graph $G = (V, E)$, (2) each driver $D_j$,
besides having a budget $b_j = u_j$, also has a minimum amount of money $\ell_j$ that she wants to spend,
and (3) the goal now is maximizing the number of satisfied drivers who take the highway (rather
than maximizing the profit). Here $w$ can be interpreted as a vector of weights.

Let $OPT = (w^*, \mathcal{D}^*)$ be the optimal solution and define $W^* := \sum_{e \in E} w^*(e)$. Abbreviate $\ell_{\max} :=
\max\{\ell_j \mid j = 1, \ldots, m\}$. Observe that w.l.o.g. $w^*(e) \leq \ell_{\max}$ on all edges. Hence, $W^* \leq n \cdot \ell_{\max}$.
Since interval matrices are totally unimodular, we can also assume that $w^*(e) \in \mathbb{Z}_{\geq 0}$ for all $e \in E$.
By adding a dummy edge to the left of the line graph (i.e., a zero column to the left of the matrix),
we can assume that $W^* = \gamma^\ell$ for some $\ell \in \mathbb{N}$. We also attach a dummy edge to the right of
the graph.

Furthermore recall that for the highway PTAS we duplicate edges in order to obtain 0/1 
weights. The goal is guaranteeing that we can partition the total optimal weight in $\gamma^i$ pieces, 
$i = 1, \dots, \ell$, without splitting any edge. This is not possible here due to the fact that optimal 
edge weights are not necessarily polynomially bounded. However, it is sufficient to duplicate 
each edge $\gamma \cdot \ell \cdot m$ times to achieve the same goal¹ (see Appendix B for a proof). Altogether, we 
obtain a well-rounded instance $G_0$ with the following properties: (1) between any two nodes that 
are starting point or end point of some driver, one has at least $\gamma \cdot \ell \cdot m$ edges; (2) the weight of the 
optimal solution is a power of $(1/\varepsilon)^{1/\varepsilon}$; (3) at both endings of the highway we have $\gamma \cdot \ell \cdot m$ many 
edges that are not used by any driver. 

Our algorithm applies to such well-rounded instances and begins by guessing $W^*$. Since $W^*$ 
is a power of $(1/\varepsilon)^{1/\varepsilon}$, there are at most a polynomial number of candidate values. Recall that the 
randomization in the algorithms before was used to create a new probabilistic optimal solution. 
The careful reader might have noticed that the random choice of $x$ can also be moved to the 
analysis. To simplify a later derandomization, in the algorithm we only choose $y \in \{1, \dots, 1/\varepsilon\}$ 
uniformly at random and approximate a solution that installs a total weight of $W' = (1/\varepsilon)^y \cdot W^*$ 
on the edges. 
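The guessing step can be sketched as a plain enumeration. The snippet below assumes that $1/\varepsilon$ is an integer, that $\gamma = (1/\varepsilon)^{1/\varepsilon}$, and that it suffices to try the powers $\gamma^{\ell}$ up to the first one that is at least $n \cdot \ell_{\max}$ (the bound on $W^*$ before padding with the dummy edge). The function names are hypothetical.

```python
from fractions import Fraction

def gamma(eps):
    """gamma = (1/eps)^(1/eps); eps is a Fraction with 1/eps an integer."""
    inv = 1 / eps
    assert inv.denominator == 1, "1/eps is assumed to be an integer"
    return int(inv) ** int(inv)

def candidate_total_weights(n, l_max, eps):
    """Powers gamma^l, l >= 1, up to the first one that is >= n * l_max."""
    g = gamma(eps)
    candidates, value = [], g
    while True:
        candidates.append(value)
        if value >= n * l_max:
            return candidates
        value *= g

def extended_weights(w_star, eps):
    """All values W' = (1/eps)^y * W* for y in {1, ..., 1/eps}."""
    inv = int(1 / eps)
    return [w_star * inv ** y for y in range(1, inv + 1)]

eps = Fraction(1, 2)                      # gamma = 2^2 = 4
guesses = candidate_total_weights(n=5, l_max=10, eps=eps)
targets = extended_weights(16, eps)
```

The number of guesses grows only logarithmically in $n \cdot \ell_{\max}$, and derandomizing the choice of $y$ amounts to looping over the $1/\varepsilon$ entries returned by `extended_weights`.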

For any subpath $P \subseteq G_0$, we compute table entries $\phi(P, W)$ over all weight assignments $w : 
P \to \mathbb{Z}_{\ge 0}$, with $w(P) = W$, and over all possible dissections of $P$, with the goal of maximizing the 
number of drivers $D_j$ such that: (1) $D_j$ is fully contained in $P$, (2) $D_j$ is good in the same sense 
as in the highway case, and (3) $\ell_j/(1 + 4\varepsilon) \le w(D'_j) \le u_j$ (the shortened driver $D'_j$ is approximately 
satisfied). The number of such table entries is bounded by a polynomial in $n$, $m$ and $\log \ell_{\max}$, since 
we only consider values $W$ which are of the form $W'/\gamma^q$. 

Eventually we output the solution $(w, \mathcal{D}')$ that attains the value $\phi(G_0, W')$. Using the arguments 
in Lemma 5 and Lemma 6 one can show that $E[\phi(G_0, W')] \ge (1 - 3\varepsilon)\,\mathrm{opt}$. As in the 
highway analysis, one has $\ell_j/(1 + 4\varepsilon) \le w(D_j) \le u_j (1 + 4\varepsilon)$ for any $D_j \in \mathcal{D}'$. Theorem 10 then follows 
(see Appendix B for more details). 



¹ The same approach can be used in the highway problem as well, though it is not crucial to obtain a polynomial 
running time in that case. 
A Tollbooth with a Constant Number of Leaves 

A detailed description of the algorithm is given in Figure 3. By $\mathcal{D}_\gamma(T)$ we denote the set of potential 
dissections of subtree $T$ into a forest $\mathcal{T} = \{T_1, \dots, T_{q(T)}\}$. Observe that each source-sink path of 
$T$ can contain at most $\gamma - 1$ break-points. Consequently, the number $q(T)$ of subtrees in each 
candidate forest is at most $\gamma \cdot \theta$. It follows that the cardinality of $\mathcal{D}_\gamma(T)$ is polynomially bounded 
when the number $\theta$ of leaves of $G$ is constant. 

Proof. (Theorem 9) Consider the randomized algorithm described in Figure 3; this algorithm can 
be derandomized by considering all the possible values of the random variables $x$ and $y$. Assume 
$0 < \varepsilon < 1$ without loss of generality. 

As in the highway case, let us restrict our attention to the dissection corresponding to the optimal 
weights, and let us discard drivers which do not provide any profit in the optimal solution. 

We start by showing that any residual driver $D_j$ is good with probability at least $1 - 3\varepsilon$. Let us 
call a driver $D_j$ straight if it lies along a source-sink path of $G$, and bent otherwise. By exactly the 
same argument as in the highway case, a straight path is good with probability at least $1 - 3\varepsilon$. 
Hence consider a bent driver $D_j$, and let $D'_j$ and $D''_j$ be the two straight subpaths which partition 
$D_j$. Paths $D'_j$ and $D''_j$ have a common endpoint, which is the node of $D_j$ which is closest to 
the sink of $G$. Without loss of generality, $w^*(D'_j) \ge w^*(D''_j)$. With the same notation as in the 
highway case, and by a similar argument, with probability at least $1 - 2\varepsilon$, there is a $q$ such that 
$\frac{1}{\varepsilon}\,\omega_q \le w^*(D'_j) \le \varepsilon\,\omega_{q-1}$. When this happens, $D'_j$ is at level $q$ in the dissection with probability at 
least $1 - \varepsilon$. Conditioning on the latter event, by the way the dissection is constructed and since 
$w^*(D''_j) \le w^*(D'_j)$, $D''_j$ is at level not smaller than $q$ in the dissection. This implies that $D_j$ is at 
level $q$ as well. We can conclude that $D_j$ crosses at least $\frac{1}{\varepsilon} - 4 \ge \frac{1}{2\varepsilon} = \delta$ many level $q$ subtrees. The 
$-4$ here comes from the fact that the portion of $D_j$ not crossing any subtree consists of at most 4 
source-sink subpaths (2 for $D'_j$ and 2 for $D''_j$, if $D_j$ is bent). Altogether, $D_j$ is good with probability 
at least $1 - 3\varepsilon$. 

Given that $D_j$ is good, the portion of $D_j$ crossing subtrees at level $q + 1$ has weight at least 
$\frac{\delta}{\delta + 4}\, w^*(D_j)$. This is by the same argument as above. Furthermore, the budget of $D_j$ in the dynamic 
program is violated at most by a factor $\frac{\delta + 4}{\delta}$: hence scaling the weights by $\frac{\delta}{\delta + 4}$ in Step (S2) 
guarantees that good paths contribute to the actual profit. Considering that the initial rounding 
introduces a factor $1 + \varepsilon$ in the approximation, altogether the solution produced by the algorithm 
gives profit at least $\left(\frac{\delta}{\delta + 4}\right)^2 \cdot \frac{1 - 3\varepsilon}{1 + \varepsilon}\,\mathrm{opt} = \frac{1 - 3\varepsilon}{(1 + 8\varepsilon)^2 (1 + \varepsilon)}\,\mathrm{opt}$ in expectation. □ 
Figure 3 PTAS for the tollbooth problem for a constant number of leaves. Here $\delta := 1/(2\varepsilon) \in \mathbb{N}$ 
and $\gamma := (1/\varepsilon)^{1/\varepsilon}$. 

Input: Well-rounded tollbooth instance $G = (V, E)$ and $(D_j, b_j)$, $j = 1, 2, \dots, m$. 

Output: Edge weights $w : E \to \mathbb{Q}_{\ge 0}$. 

Algorithm: 

(B) Bounding Phase: 

(B1) Guess the value of the weight $W^* = \gamma^{\ell}$, $\ell \in \mathbb{N}$. 

(B2) Choose integers $x \in \{1, 2, \dots, W^*\}$ and $y \in \{1, 2, \dots, 1/\varepsilon\}$ uniformly at random. Attach 
a path of length $W^* \cdot ((1/\varepsilon)^y - 1) - x$ to each sink of $G$, and a path of length $x$ to the 
source of $G$. Let $G_0$ be the resulting tree, and $W' = W^* \cdot (1/\varepsilon)^y$. 

(D) Dynamic Programming Phase: 

(D1) For every subtree $T \subseteq G_0$, 

$$\phi(T, (1/\varepsilon)^y) = \max_{w(T) = (1/\varepsilon)^y} \ \sum_{D_j \subseteq T:\ w(D_j) \le b_j} w(D_j).$$ 

(D2) For every subtree $T \subseteq G_0$, $W = W'/\gamma^q$, and $q = \ell - 1, \ell - 2, \dots, 0$, 

$$\phi(T, W) = \max_{\mathcal{T} \in \mathcal{D}_\gamma(T)} \Big\{ \sum_{T_i \in \mathcal{T}} \phi(T_i, W/\gamma) + \sum_{D_j \subseteq T:\ n_j := |\{i :\, D_j \text{ crosses } T_i\}| \ge \delta} \frac{W}{\gamma} \cdot n_j \Big\}.$$ 

(S) Scaling Phase: 

(S1) Derive $w' : G_0 \to \{0, 1\}$ determining the value of $\phi(G_0, W')$. 

(S2) Output $w : E \to \mathbb{Q}_{\ge 0}$, where $w(e) = w'(e) \cdot \frac{\delta}{\delta + 4}$. 



B Maximum-Feasible Subsystem for Interval Matrices 

Recall that a driver $D_j$ belongs to a path $P$ in a dissection, if $P$ is the maximal path with $D_j \subseteq P$. 
Suppose the driver $D_j$ indeed belongs to $P$ and the dissection splits $P$ into $\mathcal{P} = \{P_1, \dots, P_\gamma\}$. Then 
$D_j$ is termed good if the number of $P_i$'s with $P_i \subseteq D_j$ is at least $\delta = 1/(2\varepsilon)$. 

The algorithm in Figure 4 computes table entries $\phi(P, W)$ representing the maximum number 
of good drivers $D_j \subseteq P$ that can be approximately satisfied under the constraint $w(P) = W$. 
The main difference to the previous algorithms is that, if we reach a path $P$ not containing any 
driver $D_j$, then we define $\phi(P, W) = 0$. First note that the number of table entries is bounded by a 
polynomial in $n$ and $\log \ell_{\max}$. Hence, the table entries can be computed in time $\mathrm{poly}(n, m, \log \ell_{\max})$. 

Next, we argue why the value of the computed table entry is not much worse in expectation 
than the optimal number of satisfiable drivers. 
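Figure 4 PTAS with $1 + O(\varepsilon)$-violation for the MaxFS problem. Here $\delta := 1/(2\varepsilon) \in \mathbb{N}$ and $\gamma := (1/\varepsilon)^{1/\varepsilon}$. 

Input: Well-rounded MaxFS instance $G_0 = (V, E)$ and $(D_j, \ell_j, u_j)$, $j = 1, 2, \dots, m$. 

Output: Edge weights $w : E \to \mathbb{Z}_{\ge 0}$, drivers $\mathcal{D}' \subseteq \mathcal{D}$. 

Algorithm: 

(B) Bounding Phase: 

(B1) Guess the value of the total weight $W^* = \gamma^{\ell}$, $\ell \in \mathbb{N}$. 

(B2) Choose $y \in \{1, \dots, 1/\varepsilon\}$ uniformly at random. Define $W' = W^* \cdot (1/\varepsilon)^y$. 

(D) Dynamic Programming Phase: 

(D1) For every path $P \subseteq G_0$, 

$$\phi(P, (1/\varepsilon)^y) = \max_{w(P) = (1/\varepsilon)^y} \big| \{ D_j \subseteq P \mid \ell_j/(1 + 4\varepsilon) \le w(D_j) \le u_j \} \big|.$$ 

For every path $P$ with no $D_j \subseteq P$, define $\phi(P, W) = 0$ for any $W = W'/\gamma^q$, $q = 
0, \dots, \ell - 1$. 

(D2) For every path $P \subseteq G_0$, and for $W = W'/\gamma^q$, $q = \ell - 1, \ell - 2, \dots, 0$, 

$$\phi(P, W) = \max_{(P_1, \dots, P_\gamma) \in \mathcal{D}_\gamma(P)} \Big\{ \sum_{i=1}^{\gamma} \phi(P_i, W/\gamma) + \big| \{ D_j \subseteq P \mid n_j := |\{i : P_i \subseteq D_j\}| \ge \delta,\ \ell_j/(1 + 4\varepsilon) \le \tfrac{W}{\gamma} \cdot n_j \le u_j \} \big| \Big\}.$$ 

(O) Output Phase: 

(O1) Derive $w : E \to \mathbb{Z}_{\ge 0}$ and $\mathcal{D}' \subseteq \mathcal{D}$ determining the value of $\phi(G_0, W')$. 

(O2) Output $(w, \mathcal{D}')$. 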



Lemma 11. The final table entry satisfies $E[\phi(G_0, W')] \ge (1 - 3\varepsilon)\,\mathrm{opt}$. 

Proof. Let $w^* : E \to \mathbb{Z}_{\ge 0}$ be the optimal weight function of total weight $W^*$. Recall that we have 
inserted dummy edges to the left and to the right, not contained in any driver $D_j$. We choose an 
integer $x \in \{1, 2, \dots, W^*\}$ uniformly at random. Then increase the total weight on the dummy 
edges to the right by $W^* \cdot ((1/\varepsilon)^y - 1) - x$ and the weight on the dummy edges to the left by $x$. The 
total weight of $w^*$ is now indeed $W' = W^* \cdot (1/\varepsilon)^y$. It now suffices to show the promised bound 
on $E[\phi(G_0, W')]$ over the random choices of $y$ and $x$. 

Recall that for MaxFS we could not assume that all edges carry just unit weight. Hence we 
need to argue why there still is a proper dissection induced by $w^*$ when each edge is replaced by 
just $\gamma \cdot \ell \cdot m$ many edge segments. To see this, imagine the line graph $G^*$ which emerges from 
replacing any edge $e$ by $w^*(e)$ many edges. As in previous sections, there is a proper dissection 
induced by $w^*$, potentially with an exponential number of leaves. We think of this dissection 
as being constructed in a top-down fashion, where the dynamic program truncates the dissection at 
empty paths, i.e. paths that do not contain any driver. How many paths (or nodes in the dissection tree) can 
this truncated dissection have? Any of the $m$ drivers is fully contained in not more than $\ell$ many 
paths (which is the depth of the dissection tree). And any remaining empty path must have a 
parent that is non-empty. Hence the number of paths $P$ in the truncated dissection tree is bounded 
by $\gamma \cdot \ell \cdot m$. Since we replaced any edge in the original graph by that many edge segments, this 
truncated dissection also exists in $G_0$. 

Again, by Lemma 5, if we consider the (truncated) dissection of $G_0$ which is induced by the 
optimal solution, any driver $D_j$ is good with probability at least $1 - 3\varepsilon$. Suppose that $D_j$ is good 
and satisfied in the optimal solution, i.e. $\ell_j \le w^*(D_j) \le u_j$. Then 

$$w^*(D'_j) \ge \frac{\delta}{\delta + 2}\, w^*(D_j) \ge \ell_j/(1 + 4\varepsilon)$$ 

and of course $w^*(D'_j) \le w^*(D_j) \le u_j$. In other words, $D_j$ would be included by the dynamic 
program. The claim follows again by linearity of expectation. □ 

Finally we argue that the returned drivers are approximately satisfied by the computed weight 
function. 

Lemma 12. Let $(w, \mathcal{D}')$ be the returned solution. For every driver $D_j \in \mathcal{D}'$, one has $\ell_j/(1 + 4\varepsilon) \le 
w(D_j) \le u_j (1 + 4\varepsilon)$. 

Proof. Again let $D'_j$ be the shortened driver of $D_j$ w.r.t. the dissection induced by the computed 
weight function $w$. First of all $w(D_j) \ge w(D'_j) \ge \ell_j/(1 + 4\varepsilon)$. Next, the driver $D_j$ is good, hence 

$$w(D_j) \le \frac{\delta + 2}{\delta}\, w(D'_j) = (1 + 4\varepsilon)\, w(D'_j) \le (1 + 4\varepsilon) \cdot u_j.$$ 

□ 
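The constant in this bound can be checked with exact arithmetic. The sketch below assumes $\delta = 1/(2\varepsilon)$ and reads "good" as: the shortened driver consists of at least $\delta$ full pieces of equal weight, while at most two partial pieces (one at each end) are cut off, each lighter than a full piece. Both helper names are hypothetical.

```python
from fractions import Fraction

def stretch_bound(eps):
    """Return (delta + 2) / delta for delta = 1 / (2 * eps), exactly."""
    delta = 1 / (2 * eps)
    return (delta + 2) / delta

def worst_case_ratio(num_pieces):
    """Upper bound on w(D_j) / w(D'_j) when D'_j covers num_pieces full
    pieces and at most two partial pieces are cut off at the ends."""
    return Fraction(num_pieces + 2, num_pieces)

eps = Fraction(1, 10)
delta = 1 / (2 * eps)                      # delta = 5
bound = stretch_bound(eps)                 # equals 1 + 4 * eps

# Any good driver has at least delta pieces, so its ratio never exceeds the bound.
ratios_ok = all(worst_case_ratio(n) <= bound for n in range(int(delta), 50))
```

The ratio $(n + 2)/n$ decreases as the number $n$ of covered pieces grows, which is why requiring $n \ge \delta$ is enough to cap the violation at $1 + 4\varepsilon$.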

We observe that the above algorithm can be easily derandomized by trying out all $1/\varepsilon$ many 
choices of $y$. In total, Theorem 10 follows. 